Binary salp swarm algorithm for discounted {0-1} knapsack problem

While the classical knapsack problem has been a target of optimization algorithms for many years, another version of this problem, the discounted {0-1} knapsack problem, has recently gained considerable attention. The original knapsack problem requires selecting items from an item set to maximize the total benefit while ensuring that the total weight does not exceed the knapsack capacity. The discounted {0-1} knapsack problem has more stringent requirements: items are divided into groups, and at most one item from each group can be selected. This constraint, which does not exist in the original knapsack problem, makes the discounted {0-1} knapsack problem even more challenging. In this paper, we propose a new algorithm based on the salp swarm algorithm, in four different variants, to solve the discounted {0-1} knapsack problem. In addition, we use an effective data modeling mechanism and a greedy repair operator that helps overcome local optima when searching for the global optimal solution. Experimental and statistical results show that our algorithm is superior to currently available algorithms in terms of solution quality, convergence, and other statistical criteria.


Introduction
Mathematical problems are not merely theoretical abstractions. Many of them model real-life problems, and some have broad practical applications. Hence, solving those problems has become more urgent, and researchers have devoted efforts to this area for many decades. The knapsack problem (KP) [1] is a typical example of such problems. In particular, KP is a classical combinatorial optimization problem in which we have a given set of items, and each item is associated with a weight and a profit value.
The knapsack problem has vast practical applications in many areas. These applications include but are not limited to computer memory management, facilities management, energy consumption optimization, adaptive multimedia systems, resource allocation, logistics, encryption, cryptography, etc. All these seemingly unrelated problems meet in a common point: there is a limited resource and the utilization of that resource needs to be optimized.
To solve the KP, we have to determine an item subset with the maximum total profit while the total weight of the selected items is still less than or equal to a predetermined knapsack capacity.
Several approaches to the discounted {0-1} knapsack problem (DKP01) have been proposed. One of them uses a partitioning scheme to divide the original DKP01 into sub-problems to reduce calculation complexity, and utilizes dynamic programming to solve them. Additionally, [6] proposes an exact algorithm that tries to minimize the total cost with a predetermined sum value to solve DKP01. Then, based on this algorithm, three approximate algorithms are introduced. Evolutionary and swarm intelligence-based computation are also applied in solving DKP01. The authors of [7] propose two mathematical models for DKP01 and two genetic algorithm-based algorithms, FirEGA and SecEGA, to solve the problem. In [8], the authors propose two evolutionary operators called the global exploration operator (R-GEO) and the local development operator (R-LDO) to design a ring theory-based evolutionary algorithm for DKP01. While the two operators rely on ring theory, the evolutionary algorithm is based on the flower pollination algorithm [9]. The authors of [10] also propose a multi-strategy algorithm for DKP01 on the basis of monarch butterfly optimization (MBO) [11]. In this study, the monarch butterfly population is separated into two sub-populations. The positions of monarch butterfly individuals in the first sub-population are handled by a neighborhood mutation-based crowding operator, which replaces the original MBO migration operator. Moreover, in [12], the application of moth search (MS) to DKP01 is investigated. First, the impacts of the Lévy flights operator and the fly straightly operator on basic MS are evaluated. Then, nine MS-based algorithms are developed using a global-best harmony search (GHS)-based mutation operator.
Another contribution in nature-inspired optimization algorithm application, [13], introduces a discrete hybrid teaching-learning-based optimization algorithm (HTLBO) to resolve DKP01. A quaternary code is introduced to represent a DKP01 solution, and the individuals are modeled by double coding. The Learner's learning strategy is improved to expand the discovery capabilities of HTLBO, while self-learning is implemented to balance exploration and exploitation. Two sorts of crossover are also designed to strengthen the effectiveness of global search in this algorithm.
In a recently published work [14], Truong developed a binary version of the well-known particle swarm optimization algorithm [15] to solve DKP01. In another publication [16], moth-flame optimization [17] is used to solve this problem. Most recently, an improved version of the Harris Hawks Optimization (HHO) algorithm [18] was proposed in [19]. Although the HHO algorithm has a fairly good balance between exploration and exploitation, the authors of [19] suggest tweaks that target the attack phase of the Harris hawks using an opposition-based learning (OBL) strategy to increase the diversity of the search process. The main idea of OBL is to compare the fitness values of the current solution and its opposite and then choose the better one to include in the next generation. Additionally, the prey escape energy, which was originally designed to decrease linearly, is redefined to decrease logarithmically, that is, non-linearly, making the transition between exploration and exploitation smoother. The authors also introduce a random unscented sigma point mutation mechanism to help HHO converge more quickly to the best solution the algorithm can achieve. Besides solving traditional benchmark functions (CEC2017 and CEC2020) and engineering problems, the resulting algorithm is also used to solve DKP01 on selected data sets. However, the test results show that the existing DKP01 test instances are not simple, and this algorithm has not achieved very good results, which also means that there is still considerable room for other solutions in the future.
Besides taking advantage of classical optimization algorithms, new optimization algorithms are also regularly introduced and open up new directions in solving optimization problems. An example of this is [20], where the authors present two variants of a widely accepted swarm intelligence-based optimization algorithm: the single-objective salp swarm algorithm (SSA) and the multi-objective salp swarm algorithm (MSSA). The main inspiration of SSA and MSSA is the swarming behavior of salps when navigating and foraging in the seas. Test results on various data sets show that SSA is able to improve the initial random solutions and converge towards the optimum. More details on SSA are given in the next section of this paper.
Despite being a relatively new algorithm, SSA has been cited in several scientific works across various research fields. In [21], the authors develop a binary version of SSA utilizing eight transformation functions and a crossover operator instead of the basic one which the original SSA provides. In [22], to study the optimal connections between switches and controllers and the optimal number of deployed controllers in large-scale software-defined networks (SDN) [23], the authors propose an optimization algorithm based on SSA using chaotic maps.
In an effort to solve the feature selection problem, [24] introduces another chaotic SSA algorithm and integrates it with a K-nearest neighbor classifier. Their solution is also shown to be efficient in tackling the local optima stagnation issue as well as improving the convergence behavior of the original SSA algorithm. [25] implements opposition-based learning in the initialization phase of SSA to enhance its population diversity. Moreover, a local search algorithm (LSA) is also used in this work to improve exploitation performance. The authors of [26] propose a binary SSA using a modified arctan transformation. In [27], SSA is enhanced by balancing the exploration and exploitation processes. [28] extends the original SSA by implementing multiple independent salp chains and applies them to maximum power point tracking (MPPT) of photovoltaic systems under partial shading conditions. [29] uses space transformation search (STS) [30] to improve the performance of SSA, and the resulting algorithm is deployed to train a multi-layer feed-forward network. A recent publication, [31], proposes new mutation operators to balance the exploration and exploitation phases of SSA. The authors of [32] present the solitary and colonial reproduction phases of salps in the emended salp swarm algorithm (ESSA), which is used to solve the economic load dispatch problem in a multi-objective framework. In [33], a composite mutation strategy (CMS) and a restart strategy (RS) are integrated into SSA to boost its exploitation and exploration as well as to help salps avoid local optima.
Though numerous studies have referred to SSA, to the best of our knowledge, this paper is the first to utilize SSA in solving DKP01. Although the algorithms for DKP01 mentioned above have achieved encouraging results, some of their design choices can be improved. One example is the classical solution representation, which is not an ideal choice and is replaced by the scheme proposed in this paper. Besides, we combine the power of SSA with a greedy repair operator, both for local optimization and to address a weakness of SSA: its solutions easily get stuck at a local optimum and cannot escape to try other candidate solutions during the global optimization process. In detail, the contributions of this work include:
• A novel binary salp swarm algorithm (BSSA) with four binary transformation functions and a new solution presentation scheme to solve the discounted {0-1} knapsack problem.
• A combination with a minimal encoding scheme whose binary solution vector length is 2n (in comparison to the length of 3n of the original DKP01 that is used in many previous papers). While providing enhancements in calculation speed and reducing the complexity, this scheme automatically satisfies the constraint of the DKP01 stated in Eq 2.
• The use of a repair operator on the positions of the salps during salp chain movement towards the food source to avoid local optima and enhance calculation effectiveness.
The rest of this paper is organized as follows. The next section gives an introduction to the salp swarm algorithm (SSA), which is the basis for our algorithm. The section after that details our proposed binary SSA for DKP01. Then come the simulation results and a discussion of our algorithm's performance in comparison to those of other existing algorithms. Finally, the Conclusions section concludes the paper.

Salp swarm algorithm
Introduced by [20], SSA has received much attention recently due to its simplicity, effectiveness, as well as adaptability to various optimization problems. This section will give details on this algorithm.
A salp, which is generally found in deep seas but sometimes near the surface, is a barrel-shaped, planktonic tunicate that moves by contracting, thereby pumping water through its jelly-like body. One of the most interesting behaviors of a salp population is forming a salp chain, which may increase the effectiveness of the swarm in traveling and foraging. SSA is an attempt to mimic the swarming behavior of salps in oceans.
In SSA, individuals in a salp population are classified into two categories: the leader, which is the salp at the head of the chain, and the followers. The position of a given salp is defined in an n-dimensional search space, where n is the number of variables of the problem to be solved. As a result, the position vectors of the whole salp population form a 2-dimensional matrix named pos. The food source of the swarm is modeled as the target F in the search space.
The position of the leader is updated by Eq 5, where $pos^1_j$ represents the coordinate of the leader in the jth dimension, $F_j$ is the position of the target F in the jth dimension, $ub_j$ is the upper bound of the jth dimension, and $lb_j$ is the lower bound of the jth dimension. Additionally, $c_1$ is a coefficient generated by Eq 6, where k is the current iteration and K is the maximum number of iterations. Meanwhile, $c_2$ and $c_3$ are random numbers in the range [0, 1].
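For reference, the leader update and the coefficient $c_1$ of the original SSA [20] are commonly written as below. This is a reconstruction under assumptions rather than a copy of the paper's equations: the equation numbers follow the references to Eq 5 in the text, and the 0.5 threshold on $c_3$ follows common SSA implementations and is not stated here.

```latex
\begin{align}
pos^{1}_{j} &=
\begin{cases}
F_j + c_1\bigl((ub_j - lb_j)\,c_2 + lb_j\bigr), & c_3 \ge 0.5 \quad \text{(assumed threshold)},\\
F_j - c_1\bigl((ub_j - lb_j)\,c_2 + lb_j\bigr), & c_3 < 0.5,
\end{cases} \tag{5}\\
c_1 &= 2\,e^{-\left(\frac{4k}{K}\right)^{2}} \tag{6}
\end{align}
```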
The positions of the followers are updated by Eq 7. The general idea of SSA is simple: the leader moves towards the target (food source), and the followers trail the leader. In optimization problems, the global optimum should be the target, but it is not known in advance. To resolve this, the best solution obtained so far is treated as the global optimum, and the salps head towards it. The pseudo-code of SSA is shown in Algorithm 1: in each iteration, the position of each follower is updated by Eq 7, the salps are amended based on the upper and lower bounds of the variables, and the algorithm finally returns F.
Next, we analyze how SSA works. The interesting part is how the position elements of the salps are manipulated. Firstly, the lower bound lb and upper bound ub vectors are critical in keeping the position elements of the leading salp in the valid range, which in turn leads the followers on the right path. For simplicity, assume that all elements of lb have the same value of 1, and all elements of ub have the same value of 10. Thus, from Eq 5 and since $c_2$ is randomly generated in [0, 1], the leader is displaced from $F_j$ by $c_1(9c_2 + 1)$ in the jth dimension (Eq 8).
The values of $c_1$ are in the range [0, 2]. Assuming that the maximum iteration K = 100, the curve formed by the values of $c_1$ is illustrated in Fig 1. It can be seen that the closer k gets to K, the smaller $c_1$ is, and the smaller the chance that the corresponding position element of the leading salp changes significantly. Although this is consistent with the nature of global search in evolutionary computation, where searches in the early stage should cover a broader scope than those in the later stage, it also carries the risk that the leading salp easily gets stuck at a local optimum.
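The shrinking effect of $c_1$ described above can be checked numerically. The sketch below assumes the form $c_1 = 2e^{-(4k/K)^2}$ from the original SSA [20] and the simplified bounds lb = 1, ub = 10 used in the text; the function names are illustrative and not taken from the paper.

```python
import math


def c1(k: int, K: int) -> float:
    """Assumed SSA coefficient: c1 = 2 * exp(-(4k/K)^2)."""
    return 2.0 * math.exp(-((4.0 * k / K) ** 2))


def leader_step_bounds(k: int, K: int, lb: float = 1.0, ub: float = 10.0):
    """Smallest and largest leader displacement |pos - F| for c2 in [0, 1].

    With c2 = 0 the term (ub - lb) * c2 + lb equals lb, with c2 = 1 it equals ub.
    """
    coeff = c1(k, K)
    return coeff * lb, coeff * ub


if __name__ == "__main__":
    K = 100
    for k in (1, 25, 50, 75, 100):
        low, high = leader_step_bounds(k, K)
        print(f"k={k:3d}  c1={c1(k, K):.6f}  step in [{low:.6f}, {high:.6f}]")
```

Under these assumptions the largest possible step at k = 50 is already well below one unit of a search range of width 9, which illustrates why a leader that is trapped late in the run has little chance to move away.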
This becomes even more serious when a position element of a following salp is simply the average between the value of the new position element of the preceding salp and the value of its own position element in the previous iteration. This means that, when the leading salp gets stuck, it is unlikely that the salps that follow it have a way to assist it in coming up with solutions to get out of the local optimum.
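The averaging behavior described above can be illustrated with a small sketch. It assumes that each follower moves to the midpoint between its own previous position and the already updated position of the preceding salp, as stated in the text; the sample positions and function names are hypothetical.

```python
def update_followers(pos):
    """Update followers in place.

    pos[0] is the leader and is left untouched. Each follower moves to the
    midpoint between its previous position and the new position of the salp
    in front of it.
    """
    for i in range(1, len(pos)):
        pos[i] = [(own + prev) / 2.0 for own, prev in zip(pos[i], pos[i - 1])]
    return pos


def chain_spread(pos):
    """Largest coordinate-wise distance between any follower and the leader."""
    leader = pos[0]
    return max(abs(x - l) for salp in pos[1:] for x, l in zip(salp, leader))


# Hypothetical chain: a leader stuck at (5, 5) and three scattered followers.
pos = [[5.0, 5.0], [1.0, 9.0], [9.0, 2.0], [3.0, 7.0]]
before = chain_spread(pos)
for _ in range(20):
    update_followers(pos)  # the leader never moves, i.e. it is stuck
after = chain_spread(pos)
print(before, after)
```

With a stationary leader, the followers collapse onto the leader's position within a few iterations, so the chain loses the diversity it would need to pull the leader out of a local optimum.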
To resolve this problem and due to the fact that SSA has no mechanism to deal with DKP01 constraints, we decide to implement a repair operator which will help the solution given by SSA avoid local optima, and improve its fitness. Details of this operator and other proposed algorithms are given in the next section. Note that although there exists a multi-objective version of SSA, discussion of it is beyond the scope of this paper.

Proposed binary salp swarm algorithm for DKP01
The original SSA needs many amendments to solve the DKP01. This section will provide details on the solutions that we propose.

Binary transformation functions
To operate in a binary search space, binary transformation functions are necessary so that the related parameters take only the values 0 or 1. The sigmoid function [34,35] is widely accepted as a means of transforming real values into probabilities. In this paper, we utilize four S-shaped sigmoid transformation functions, $Sig_1$ to $Sig_4$, defined in Eqs 9-12. Plots of these functions are shown in Fig 2. Using these four functions, we propose a novel Binary SSA (BSSA) optimizer for DKP01 with four variants: BSSA1, BSSA2, BSSA3, and BSSA4. In particular, BSSA1 uses $Sig_1(\cdot)$, BSSA2 uses $Sig_2(\cdot)$, BSSA3 uses $Sig_3(\cdot)$, and BSSA4 uses $Sig_4(\cdot)$. Our new algorithm also uses modified versions of Eqs 5 and 7 that apply the transformation functions given in Eqs 9-12.
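As an illustration of how an S-shaped transformation function turns a real-valued position element into a bit, consider the sketch below. The four slopes are an assumption borrowed from the S-shaped family that is common in the binary swarm optimization literature; the exact definitions of Eqs 9-12, and which slope belongs to which of $Sig_1$ to $Sig_4$, may differ in the paper.

```python
import math
import random

# Assumed slopes of an S-shaped family; the paper's Eqs 9-12 may use others.
SLOPES = {"Sig1": 2.0, "Sig2": 1.0, "Sig3": 0.5, "Sig4": 1.0 / 3.0}


def s_shaped(x: float, slope: float) -> float:
    """Logistic function mapping a real value to a probability in (0, 1)."""
    # Clamp the exponent so math.exp cannot overflow for extreme inputs.
    z = max(-700.0, min(700.0, -slope * x))
    return 1.0 / (1.0 + math.exp(z))


def binarize(x: float, slope: float, rng: random.Random) -> int:
    """Return 1 with probability s_shaped(x, slope), otherwise 0."""
    return 1 if rng.random() < s_shaped(x, slope) else 0


rng = random.Random(0)  # fixed seed so the sketch is repeatable
bits = [binarize(x, SLOPES["Sig2"], rng) for x in (-3.0, -0.5, 0.0, 0.5, 3.0)]
print(bits)
```

A steeper slope pushes probabilities towards 0 or 1 more quickly, so the resulting bits follow the sign of the real-valued position more closely, while a flatter slope keeps more randomness in the bits.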
Firstly, we define $z_1$ and $z_2$ as:

Solution presentation
The traditional approach to encoding a solution of a {0, 1} optimization problem is to use a binary vector whose length is the number of dimensions of the search space. For DKP01, this gives a vector of length 3n in which each three-bit binary number represents the three items in a group. If a bit is set to 1, the item at the corresponding position is selected; a value of 0 means that the item is not chosen to be in the knapsack.
In this paper, we use a binary encoding scheme as shown in Table 1, with two-bit binary numbers used for solution presentation, as described in Eq 18.
For binary solutions of length 3n, the search space has $2^{3n}$ possible cases, while binary solutions of length 2n have a much smaller search space of $2^{2n}$ potential cases. The 2n representation therefore allows candidate solutions to be searched for in a much smaller space. Besides, solutions in the 2n representation automatically satisfy the constraint specified in Eq 2, which is not the case for the 3n representation. 3n solutions need to be checked to ensure that no three-bit binary number violates Eq 2. In the worst case, when a violation occurs, another value needs to be assigned to that number. These actions are not necessary with binary solutions of length 2n.
When considering the constraint of Eq 2, all random solutions in the 2n search space are possible solutions, while the 3n search space contains non-viable solutions. Therefore, representing a solution of length 2n reduces the computation time.
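The two-bit encoding can be sketched as follows. The mapping follows Table 1 as far as it can be read from the text (10 selects the second item and 11 the third); 00 for "no item" and 01 for the first item are inferred from the pattern. The function names are illustrative.

```python
# Two-bit code -> index of the selected item within its group (None = no item).
# 10 and 11 follow Table 1; 00 and 01 are inferred from the same pattern.
CODE_TO_ITEM = {(0, 0): None, (0, 1): 0, (1, 0): 1, (1, 1): 2}


def decode(solution):
    """Decode a 2n-bit solution into one choice per group."""
    if len(solution) % 2 != 0:
        raise ValueError("solution length must be 2n")
    return [CODE_TO_ITEM[(solution[i], solution[i + 1])]
            for i in range(0, len(solution), 2)]


def to_3n(solution):
    """Expand a 2n-bit solution into the classical 3n-bit vector."""
    bits = []
    for choice in decode(solution):
        group = [0, 0, 0]
        if choice is not None:
            group[choice] = 1
        bits.extend(group)
    return bits


def search_space_ratio(n: int) -> int:
    """How many times larger the 3n-bit space is than the 2n-bit space."""
    return 2 ** (3 * n) // 2 ** (2 * n)  # equals 2 ** n


print(decode([0, 0, 0, 1, 1, 0, 1, 1]))  # one choice per group
print(to_3n([0, 1, 1, 1]))
```

Every two-bit code decodes to at most one item per group, so any random 2n-bit vector already respects the one-item-per-group constraint, whereas a random three-bit group can contain two or three ones.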

Repair operator
To deal with the restriction in Eq 3 and enhance the solution, we use a repair operator based on the functions used in [6,14]. With n groups, we have a total of 3n candidate items to be put into the knapsack, including the combined items. Note that while the mentioned functions only support the 3n solution, we design our operator so that the 2n solution is supported while all 3n items are still considered.
In short, the repair operator manipulates the selected item set based on the value-to-cost ratios $v_{i,j}/c_{i,j}$ ($i \in \{0, 1, \ldots, n-1\}$, $j \in \{1, 2, 3\}$) to reduce CPU usage and improve the ability to avoid local optima.
Since choosing which items to remove from or add to the knapsack is not simply a matter of prioritizing combined items (a particular combined item is not necessarily better than a single item), we sort all items and process them deterministically. Thus, before the repair operator is executed, all the items, including the combined ones, are sorted in decreasing order of their value-to-cost ratios. The indexes of the items in this order are kept in the ID vector of length 3n. Using the ID vector, items with higher priority are processed first. The steps of the operator are as follows.
The repair operator has two phases: the repair phase and the optimization phase. The repair phase turns an infeasible solution into a feasible one, while the optimization phase enhances the fitness of a feasible solution. If the current total cost is greater than C, the repair phase removes items from the knapsack until the condition given by Eq 3 is met. After that, the optimization phase adds items to the knapsack provided that the total cost does not exceed C.

Table 1. Binary encoding scheme.
No.  Code  Meaning
1    00    No item in the group is selected
2    01    The first item in the group is selected
3    10    The second item in the group is selected
4    11    The third item in the group is selected
https://doi.org/10.1371/journal.pone.0266537.t001
The inputs of the operator include the solution Y of length 2n, the cost vector of length 3n, the index vector ID, and the knapsack capacity C.
Algorithm 2 shows the pseudo-code of the related repair operator. Note that the operator's computational complexity is O(n).
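The two phases can be sketched as follows. This is one plausible reading of the description above, not the authors' Algorithm 2: for brevity it works on a decoded per-group choice list instead of the 2n-bit vector, it assumes that the optimization phase only fills groups that currently have no selected item, and all names and sample data are hypothetical.

```python
def build_id(values, costs):
    """Indices of all 3n items sorted by decreasing value-to-cost ratio."""
    return sorted(range(len(values)),
                  key=lambda i: values[i] / costs[i], reverse=True)


def repair(choice, costs, ids, capacity):
    """Greedy repair followed by greedy optimization.

    choice[g] is None or 0..2; item j of group g has flat index 3 * g + j.
    Returns the new choice list and its total cost.
    """
    choice = list(choice)
    total = sum(costs[3 * g + j] for g, j in enumerate(choice) if j is not None)

    # Repair phase: drop selected items, worst ratio first, until feasible.
    for i in reversed(ids):
        if total <= capacity:
            break
        g, j = divmod(i, 3)
        if choice[g] == j:
            choice[g] = None
            total -= costs[i]

    # Optimization phase: add items, best ratio first, to empty groups
    # (assumption) as long as the capacity is not exceeded.
    for i in ids:
        g, j = divmod(i, 3)
        if choice[g] is None and total + costs[i] <= capacity:
            choice[g] = j
            total += costs[i]
    return choice, total


# Hypothetical data: two groups, the third item of each is the combined one.
values = [10, 12, 22, 8, 9, 17]
costs = [5, 6, 9, 4, 6, 8]
ids = build_id(values, costs)
print(repair([2, 2], costs, ids, capacity=12))
```

Each phase is a single pass over the 3n sorted items, which matches the linear complexity stated for the operator once the ID vector is available.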
To sum up, the pseudo-code of our proposed BSSA algorithms for DKP01 is detailed in Algorithm 3.

Results and discussion
The simulations in this paper serve the following goals:
• Compare the four variants of our proposed BSSA to determine the best ones for DKP01. This is an internal test only. Thus, only the algorithms proposed in this paper are included in the related tests, diagrams, and tables.
• Then, our best BSSA variants for DKP01 will be compared to selected algorithms proposed by other scientific works to see which one performs best in various aspects through statistical calculations. The chosen algorithms are the best we could find in recent publications.
Firstly, we choose two revised versions of the genetic algorithm (GA) [36] and particle swarm optimization (PSO) [15] to include in the comparison. In the case of the GA variant for DKP01, we choose FirEGA, which is introduced in [7]. The PSO version to be tested is the best one from [14], BPSO8. We also include the results of MS1, which is designed based on the moth search algorithm and is the best algorithm for DKP01 proposed in [12], and MMBO, a multi-strategy monarch butterfly optimization algorithm for DKP01 introduced by [10]. The authors of [10] propose many variants of their algorithm, and we choose the best of them. In [19], the authors tested their algorithm on selected instances of DKP01. Since this is a promising algorithm and a recently published work, we also include its experimental results for comparison.
The parameters used for testing are shown in Table 2. For a fair comparison, we set the population sizes (the number of particles in the case of the PSO variant) to the same value, 50. Furthermore, the maximum number of iterations of all algorithms is set to the number of dimensions of DKP01, 2n, for the same reason. We use 40 DKP01 instances proposed by [7] and available at https://www.doi.org/10.6084/m9.figshare.19416857.v2 to test all algorithms. They include 10 strongly correlated instances (SDKP1-SDKP10), 10 inverse strongly correlated instances (IDKP1-IDKP10), 10 uncorrelated instances (UDKP1-UDKP10), and 10 weakly correlated instances (WDKP1-WDKP10). The correlation is considered strong when cost and value are closely related and highly dependent on each other. Conversely, the correlation is considered weak when cost and value are loosely related. The number of items in each instance is 3n, $n \in \{100, 200, \ldots, 1000\}$. The mentioned instances are also used in [37].
All algorithms are coded in MATLAB R2018a and run on an ASUS laptop equipped with an Intel Core i5-8250U 1.6 GHz CPU and 8 GB of DDR3 SDRAM, with Microsoft Windows 10 as the operating system.

Convergence behaviour
Our first concern is the convergence speed of the algorithms towards the optimal solution. We recorded the degree of convergence by running the four versions of our proposed algorithm once on each of several data set files. After each iteration, the resulting best-so-far total value is saved. This set of values is then plotted as a graph showing the convergence behaviour.
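Recording the best-so-far curve amounts to a running maximum over the per-iteration results, since DKP01 is a maximization problem. A minimal sketch, assuming the best total value found in each iteration is available as a list (the sample numbers are made up):

```python
from itertools import accumulate


def best_so_far(values_per_iteration):
    """Running maximum of the total value, one entry per iteration."""
    return list(accumulate(values_per_iteration, max))


# Hypothetical per-iteration best values of a single run.
curve = best_so_far([310, 325, 318, 340, 333])
print(curve)
```

The resulting list never decreases, which is why convergence curves of this kind are monotone even when individual iterations produce worse candidates.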
The four data set types of the DKP01 problem have quite different characteristics. However, we found that the convergence behavior of these algorithms on problems of different sizes within the same data set type is not significantly different. Therefore, we decided to choose two typical cases to describe the convergence of the algorithms for each type of data set. Fig 3 summarizes the convergence curves for these types of data sets.
In the test with all data sets, BSSA1 and BSSA2 proved their superiority over the other two versions of the algorithm. They achieve better solution quality and higher fitness value from the first iterations. The early convergence behaviour also suggests that BSSA1 and BSSA2 can be further improved to take advantage of later iterations. Fig 3 also shows that while BSSA3 and BSSA4 can perform closer to the performances of BSSA1 and BSSA2 in case of smaller problems, it can be concluded that BSSA3 and BSSA4 are not appropriate to be used for larger problems.

Stability and solution quality
This subsection focuses on examining the stability and quality of the solutions returned by our proposed algorithms. For demonstration, we use box plots whose data are the best values achieved after each algorithm run. To obtain a series of best values that will be used to create the box plot, we run each algorithm 30 times and get 30 best results.
In descriptive statistics, a box plot [38] is a graphical tool that shows the distribution of a data set using its five-number summary: the minimum, the first quartile, the median, the third quartile, and the maximum. The box extends from the first quartile to the third quartile and therefore contains approximately the middle 50 percent of the data. The lowest 25 percent and the highest 25 percent of the data lie outside the box. The horizontal line in the box marks the median. Since DKP01 is a maximization problem, the higher this line is, the better the solution quality. Moreover, the flatter the box, the more consistent the values.
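The five-number summary behind each box can be computed as in the sketch below. The quartile convention is an assumption: Python's "inclusive" method is used here, while plotting tools such as MATLAB may interpolate slightly differently. The sample data are made up.

```python
import statistics


def five_number_summary(best_values):
    """Minimum, first quartile, median, third quartile, maximum."""
    q1, median, q3 = statistics.quantiles(best_values, n=4, method="inclusive")
    return min(best_values), q1, median, q3, max(best_values)


def box_height(best_values):
    """Interquartile range: a flatter box means more consistent runs."""
    _, q1, _, q3, _ = five_number_summary(best_values)
    return q3 - q1


# Hypothetical best values of nine runs.
print(five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9]))
```

Comparing `box_height` across algorithms gives a single number for the "thickness" of the boxes discussed in this subsection.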
We use the same approach as in the analysis of the convergence curves, which means that we choose two typical cases for each data set type. Fig 4 shows these charts. It is easy to see that BSSA3 and BSSA4 cannot compete with BSSA1 and BSSA2. Their boxes are thicker, which means their outputs are not stable. In other words, the differences among the best total values obtained after 30 runs of these algorithms are significant. In most cases, even the maximum best value achieved by BSSA3 and BSSA4 after 30 runs is not close to the minimum best value obtained by BSSA1 and BSSA2. This underlines the superiority of BSSA1 and BSSA2. The same goes for the other tests. A closer look at the boxes of BSSA1 and BSSA2 suggests that BSSA2 is slightly better than BSSA1 in terms of stability and solution quality.

Wilcoxon rank sum test
The Wilcoxon rank-sum test [39] is a non-parametric hypothesis test used to evaluate whether the distributions of populations obtained from two separate sources have the same medians or not. In this subsection, Wilcoxon rank-sum tests are used to assess the differences among the solutions returned by our proposed algorithms. Specifically, Table 3 displays the p-values obtained when testing the solution sets given by BSSA1 against those of BSSA2, BSSA3, and BSSA4, respectively. We use the default significance level α = 0.05. If p ≥ α, there is not enough statistical evidence to confirm that the difference between the compared populations is significant. Otherwise, it can be concluded that the difference between the two sets of values is significant.

PLOS ONE
Based on the statistical results in Table 3, we can conclude that the solutions given by BSSA1 are significantly different from the solutions returned by BSSA3 and BSSA4. The situation between BSSA1 and BSSA2 is more complicated. There are 21 times p exceeds the 0.05 threshold, while in the remaining 19 times, p is less than 0.05. In other words, in 52.5 percent of the tests we cannot statistically differentiate the solutions given by BSSA1 and BSSA2, while the difference is clear in the remaining 47.5 percent. In short, the Wilcoxon rank-sum tests reaffirm what we have observed in previous subsections: BSSA1 and BSSA2 are at the same level, and both of them are superior to BSSA3 and BSSA4.
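The decision rule used with the Wilcoxon rank-sum test can be sketched with the standard library alone. The sketch uses the normal approximation without a tie correction, which is an assumption: statistical packages may use an exact method for small samples and can therefore return slightly different p-values. The sample data are made up.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def rank_sum_test(x, y):
    """Two-sided rank-sum test via the normal approximation; returns (z, p)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    w = sum(ranks[:n1])                      # rank sum of the first sample
    mean = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value
    return z, p


ALPHA = 0.05
z, p = rank_sum_test(range(10), range(10, 20))  # two clearly separated samples
print("significant" if p < ALPHA else "not significant")
```

Applied to two sets of 30 best values, one call per instance reproduces the kind of pairwise comparison summarized in Table 3.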

Friedman test and Nemenyi post-hoc test
This subsection presents the results of the Friedman test [40][41][42] to provide an additional statistical perspective. The Friedman test is a non-parametric alternative to the repeated measures ANOVA test [43]. It takes three or more populations as input and returns a single conclusion after comparing them. The null and alternative hypotheses of this test are:
• $H_0$: The mean values of the populations are similar.
• $H_a$: At least one population mean is different from the mean values of the rest.
Again, the significance level α = 0.05 is applied. If the p-value returned by the Friedman test is less than or equal to α, the null hypothesis is rejected and the alternative hypothesis is accepted. Otherwise, the null hypothesis cannot be rejected.
The results of the Friedman test show that at least one population mean is significantly different from the rest. To clarify which algorithm's solution population differs, we perform the Nemenyi post-hoc test [44], which returns a table containing the results of pairwise tests. Table 4 shows the p-values of this test.
The results in Table 4 show that, when comparing BSSA1 with BSSA2, the p-value is 0.9. When comparing BSSA1 with BSSA3, the p-value is 0.001. The result is the same when comparing BSSA1 with BSSA4. When comparing BSSA2 with BSSA3 and BSSA4, the p-values are the same and equal to 0.001. The p-value when comparing BSSA3 with BSSA4 is 0.00299. When assessing these values with a significance level of 0.05, it can be seen that BSSA1 and BSSA2 have similar populations of mean total values, and they are significantly different from those of BSSA3 and BSSA4. If we consider the case of BSSA3 and BSSA4, they are also considerably different, although this difference is not as significant as the difference when compared with BSSA1 and BSSA2.
In summary, the Friedman and Nemenyi tests show that the results of BSSA1 and BSSA2 are not significantly different, while they are substantially different from those of BSSA3 and BSSA4.
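For readers who want to reproduce this kind of analysis, the Friedman statistic for four algorithms can be sketched as follows. The sketch ranks the algorithms per instance (rank 1 for the largest value), omits the tie correction, and uses the closed-form chi-square tail for three degrees of freedom, which only applies to the case of four compared algorithms. The data table is hypothetical.

```python
import math


def rank_row(row):
    """Rank one instance's results across algorithms (1 = largest), ties averaged."""
    order = sorted(range(len(row)), key=lambda j: -row[j])
    ranks = [0.0] * len(row)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and row[order[end + 1]] == row[order[start]]:
            end += 1
        for position in range(start, end + 1):
            ranks[order[position]] = (start + end) / 2.0 + 1.0
        start = end + 1
    return ranks


def friedman_statistic(table):
    """Friedman chi-square: one row per instance, one column per algorithm."""
    n, k = len(table), len(table[0])
    rank_sums = [0.0] * k
    for row in table:
        for j, rank in enumerate(rank_row(row)):
            rank_sums[j] += rank
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)


def chi2_sf_df3(x):
    """Survival function of the chi-square distribution with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


# Hypothetical mean total values: 10 instances, 4 algorithms with a fixed order.
table = [[40 + i, 30 + i, 20 + i, 10 + i] for i in range(10)]
stat = friedman_statistic(table)
p_value = chi2_sf_df3(stat)
print(stat, p_value, "reject H0" if p_value <= 0.05 else "cannot reject H0")
```

When the null hypothesis is rejected, a post-hoc procedure such as the Nemenyi test is still needed to identify which pairs of algorithms differ.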

Comparison to other algorithms
In this subsection, we compare BSSA1 and BSSA2 with five other algorithms for DKP01. The first is an evolutionary algorithm, FirEGA [7], and the second is a swarm intelligence-based one, BPSO8 [14]. Additionally, the best algorithms proposed in [12] and [10], MS1 and MMBO, are also included in the comparison. Note that the two latter algorithms were not tested on inverse strongly correlated data sets, and the related papers did not provide data for some criteria. Nevertheless, the most important results are available, and they help us in this comparison phase. The results of the Improved Harris hawk optimizer (IHHO) [19], a recently published work, are also included in the statistical tables. Although the authors of this algorithm only tested it on some representative data sets, we believe that their results help further clarify where our algorithm stands.
Tables 5-8 show the test results. These tables include statistics of the total values of the solutions returned after 30 runs of each algorithm. Specifically, column Instance shows the name of the tested instance, column OPT stores the optimum value, and column Algorithm specifies the algorithm name. Best, Average, and Worst present the best, average, and worst values. Meanwhile, Stdev indicates the standard deviation, and Gap reveals the gap between the average and optimum values. The Gap value is calculated from OPT and AVE, where AVE stands for the average result.
Table 5 shows the superiority of our proposed algorithms over the other candidates when they are tested with inverse strongly correlated instances. They lead the ranking table in almost all cases. The only exceptions are IDKP1, where FirEGA has the same best value as our algorithms, and IDKP3, where IHHO has the best results in the Best and Average categories.
Overall, BSSA1 takes the top place 27 times, while BSSA2 does so 36 times. There are also 14 cases in which BSSA1 and BSSA2 share the top rank. Our proposed algorithms therefore lead comfortably on this instance type.
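The per-instance statistics reported in Tables 5-8 (Best, Average, Worst, Stdev, Gap) can be computed from the raw run outputs along the following lines. This is a minimal sketch rather than the code used in the experiments: the function name `summarize_runs` and the sample values are hypothetical, the population form of the standard deviation is an assumption (the sample form is equally plausible), and Gap is assumed to be the relative difference between OPT and the average, expressed as a percentage.

```python
import statistics


def summarize_runs(values, opt):
    """Summarize the objective values returned by repeated runs on one instance.

    DKP01 is a maximization problem, so the best value is the largest one.
    """
    average = statistics.fmean(values)
    return {
        "Best": max(values),
        "Average": average,
        "Worst": min(values),
        # Assumption: population standard deviation over the runs.
        "Stdev": statistics.pstdev(values),
        # Assumption: relative gap to the optimum, in percent.
        "Gap": (opt - average) / opt * 100.0,
    }


# Hypothetical outputs of five runs on an instance whose optimum is 1000.
runs = [990, 1000, 980, 1000, 995]
summary = summarize_runs(runs, opt=1000)
```

A smaller Gap means the average result lies closer to the optimum, while a smaller Stdev means the runs are more consistent with one another; the two criteria are independent, which is why an algorithm can lead on one and not on the other.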
For strongly correlated instances, the situation changes. Table 6 reports the data for these tests. BPSO8 adapts very well to this instance type, leading the rankings 19 times in total. BSSA1 leads 14 times and BSSA2 leads 7 times, while IHHO, MS1, and FirEGA never take the top place in any category. Note that BPSO8 leads on the gap value in all 10 strongly correlated instances; that is, its average fitness value is closer to the optimum value than that of its competitors. It also means that, with only 9 first places in the remaining categories, BPSO8 is not superior to BSSA1 and MMBO, which have 14 and 8 first places, respectively. Another interesting fact is that no top spot is shared among the tested algorithms on this instance type.
Moreover, BPSO8 appears to be better at attaining the best solution, topping the table 5 times, while the remaining 5 first places go to MMBO (3) and BSSA1 (2). The Average category is more balanced, with BPSO8, MMBO, and BSSA1 reaching the first position 3 times each. When the worst outputs of the tested algorithms over 30 runs are compared, our proposed BSSA1 finishes top 4 times, while its closest competitors, MMBO and BSSA2 (another variant of our algorithm), each reach the top spot 2 times. Because MS1 and MMBO have no data on Stdev and Gap, if we exclude the rankings on these columns, BSSA1 and BPSO8 have similar overall performance on strongly correlated instances, with MMBO finishing third, not far behind. In terms of standard deviation, our proposed algorithms lead in all 10 instances, which indicates that their returned solutions are more consistent than those of the others. In terms of average ranking, BPSO8 is best in the Best and Gap categories, and BSSA1 comes out on top in the Average, Worst, and Stdev categories.
Uncorrelated data sets are those in which the values and the costs are largely unrelated, and it is interesting to see how the tested algorithms perform on this type of distribution. Table 7 reports the performance of the algorithms on these instances. The results are fairly even, with BPSO8 and BSSA1 taking the top rank 23 times each. Interestingly, while BPSO8 is unbeaten on the gap values in all 10 instances, its standard deviation performance is weaker: the Stdev values of our proposed BSSA1 and BSSA2 are much lower than those of BPSO8. Since their returned solutions are concentrated close to the expected value, we can conclude that BSSA1 and BSSA2 are more stable than BPSO8, whose solutions are more dispersed. Finally, IHHO, FirEGA, MS1, and MMBO perform poorly on this type of DKP01 instance, with no wins. In terms of average ranking, BPSO8 and BSSA1 lead in the same categories as in the case of strongly correlated instances.
Weakly correlated instances are where MMBO performs well: Table 8 shows that MMBO gets the first rank in terms of best solutions 5 times. These tests also confirm the solid performance of our proposed algorithms. BPSO8 performs best on gap values with 10 wins, while IHHO and FirEGA record no wins in any category. Counting only the first places in the Best, Average, and Worst categories, MMBO has 9, BSSA2 has 6, BPSO8 has 4, and BSSA1 is the winner with 10. Our BSSA1 variant also dominates most of the categories in terms of average rankings.
Table 9 summarizes the performance of the tested algorithms through their average rankings. Because the results of IHHO are available for only 3 instances of each data set type, its average ranking for each type is calculated as the total ranking value divided by 3. Additionally, because the data sets used in the experiments are very different, we report the statistics by data set type to see how the algorithms perform on each of them. The lower the average rank, the better the algorithm performs. Our proposed algorithms, BSSA1 and BSSA2, outperform the other algorithms on inverse strongly correlated instances. On strongly correlated and uncorrelated instances, although BPSO8 has the best average rank in the Best category, it cannot repeat that performance in the other categories. Taking all categories into account, BSSA1 achieves the best average rankings on these two data set types. The weakly correlated data sets repeat the pattern seen with inverse strongly correlated instances, where our algorithms are clearly ahead of the others in terms of average rankings. In summary, even though the data sets differ significantly, our proposed algorithms adapt well and give higher quality solutions than the other algorithms.
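The average-ranking procedure behind Table 9 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used in our experiments: the function names, the algorithm labels "A" to "D", and all values are hypothetical; tied algorithms are assumed to share the better rank; and an algorithm that has results for only some instances is ranked on those instances only, with its rank total divided by the number of instances on which it appears.

```python
from collections import defaultdict


def competition_ranks(scores, higher_is_better=True):
    """Rank the algorithms on one instance; ties share the better rank."""
    ranks = {}
    for name, value in scores.items():
        if higher_is_better:
            better = sum(1 for other in scores.values() if other > value)
        else:
            better = sum(1 for other in scores.values() if other < value)
        ranks[name] = 1 + better
    return ranks


def average_rankings(table, higher_is_better=True):
    """table maps instance name -> {algorithm name: value}.

    Each algorithm's rank total is divided by the number of instances
    for which it has a result, so partially tested algorithms are handled.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for scores in table.values():
        for name, rank in competition_ranks(scores, higher_is_better).items():
            totals[name] += rank
            counts[name] += 1
    return {name: totals[name] / counts[name] for name in totals}


# Hypothetical values in one category (e.g. Best); "D" is tested on one instance only.
table = {
    "inst1": {"A": 100, "B": 100, "C": 90, "D": 95},
    "inst2": {"A": 200, "B": 198, "C": 190},
    "inst3": {"A": 300, "B": 305, "C": 290},
}
avg_rank = average_rankings(table)
```

For categories in which smaller values are better, such as Stdev and Gap, the same function can be called with `higher_is_better=False`.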

Computational cost
In this subsection, we provide an overview of the running time of our proposed algorithm compared to BPSO8. Since PSO is very popular and widely used, we use it as the benchmark algorithm in this test. This comparison shows the computational time of our BSSA2 algorithm relative to a widely recognized optimization algorithm. To perform this assessment, we run the algorithms on the uncorrelated instances (UDKP1-UDKP10). Each algorithm is run 30 times on each instance, and the average running time over the 30 runs is calculated. Table 10 provides the results of this test. The results show that our algorithm needs more time to finish a run, which could be due to differences in the operations of the global search algorithm. Nevertheless, the time required is still acceptable.
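The timing protocol described above (30 runs per instance, then the mean running time) can be sketched as follows. This is a minimal sketch, not our measurement harness: `average_runtime`, `dummy_solver`, and the toy instances are hypothetical placeholders, and wall-clock time measured with `time.perf_counter` is an assumption about how running time is recorded.

```python
import statistics
import time


def average_runtime(solver, instance, runs=30):
    """Return the mean wall-clock time (in seconds) of `runs` calls to solver(instance)."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        solver(instance)
        durations.append(time.perf_counter() - start)
    return statistics.fmean(durations)


def dummy_solver(instance):
    """Hypothetical stand-in for an optimization algorithm; it only sums the input."""
    return sum(instance)


# Toy inputs standing in for the benchmark instances (illustration only).
instances = {"toy1": list(range(1000)), "toy2": list(range(5000))}
timings = {name: average_runtime(dummy_solver, data) for name, data in instances.items()}
```

For a fair comparison, both algorithms should be timed on the same machine with the same population size and iteration budget, since these settings dominate the running time.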

Conclusions and future work
The discounted {0-1} knapsack problem (DKP01) is not just a theoretical problem but also one that is widely applied in real life. Therefore, finding an effective way to solve it will help real-world business and real-time decision-making systems. Following the approach of using metaheuristics to solve NP-hard problems, our paper proposes and evaluates a new optimization algorithm with four variants based on the salp swarm algorithm; it integrates several new techniques and yields better solution quality. This quality is worth the additional computational cost. The new algorithm is also more stable in producing good solutions than existing ones.
Although the performance of our algorithm is promising, some aspects can be studied further in the future. Firstly, in the current approach, SSA is responsible for covering the search space, while the repair operator is responsible for correcting errors in the solutions provided by SSA and optimizing them according to a predetermined strategy. The problem is that SSA's exploration capability is somewhat limited, and adjustments are needed to strengthen its global search capacity. Simply put, the current mechanism makes the algorithm very powerful in exploiting a certain direction and in searching the neighborhoods of the salps in the chain. However, if the algorithm is suitably modified, later salps can contribute significantly to the exploration process. Improving this property will make the algorithm more powerful and flexible. Next, the repair operator can also be improved. We will test various options to make it work even better, such as defining a new item partitioning scheme. It is also important to note that the problem instances in the test data sets range only from 100 to 1,000 dimensions. For more complex data sets, with 10,000, 100,000, or even more dimensions, current algorithms for DKP01 will reveal their weakness in terms of computational cost. This is also a direction we plan to focus on, specifically developing parallel versions of the algorithm that take advantage of the computing power of next-generation CPUs and GPUs to reduce the computational cost.